FORGe at SemEval-2017 Task 9: Deep sentence generation based on a sequence of graph transducers

Authors

  • Simon Mille
  • Roberto Carlini
  • Alicia Burga
  • Leo Wanner
Abstract

We present the contribution of Universitat Pompeu Fabra’s NLP group to the SemEval Task 9.2 (AMR-to-English Generation). The proposed generation pipeline comprises: (i) a series of rule-based graph transducers for the syntacticization of the input graphs and the resolution of morphological agreements, and (ii) an off-the-shelf statistical linearization component.

1 Setup of the system

The generator we presented for Task 9.2 of SemEval is a pipeline of graph transducers called Fabra Open Rule-based Generator (FORGe).¹ It is built upon work presented, e.g., in (Bohnet, 2006; Wanner et al., 2010). It can also be considered an extended rule-based version of (Ballesteros et al., 2015). The current generator has been mainly developed on the dependency Penn Treebank (Johansson and Nugues, 2007) automatically converted to predicate-argument structures, and adapted to the AMR inputs using SemEval’s training and evaluation sets.

¹ A slightly updated version of the submitted system can be found at https://www.upf.edu/web/taln/resources

1.1 Overview of the pipeline

The core of the generator is rule-based: graph transduction grammars convert, in several steps, abstract AMRs into syntactic structures that contain all the morphological features needed to retrieve the final forms of all the words. The syntactic structures are then linearized with an off-the-shelf tool, and finally the final forms of the words are retrieved. Our generator follows the theoretical model of the Meaning-Text Theory (Mel’čuk, 1988); the names of the intermediate layers used in Table 1 come from the MTT terminology.

Step  Description                                       Layer (MTT)  # rules
0     Conversion of AMRs format into CoNLL’09 format    ConS         N/A
1     Mapping of AMRs onto predicate-argument graphs    SemS         190
2     Assignment of parts of speech                     SemS_pos     96
3     Derivation of deep syntactic structure            DSyntS       267
4     Introduction of function words                    SSyntS       294
5     Resolution of agreements                          DMorphS      85
6     Linearization                                     SMorphS      N/A
7     Retrieval of surface forms                        Text         1
8     Post-processing                                   Text_final   4

Table 1: Overview of the AMR-to-text pipeline.

1.2 Input format conversion

Since our generator cannot read the provided format, we converted the input AMRs to the CoNLL’09 format (Hajič et al., 2009). We assume that each sentence in the original file has two components: a three-line comment with some metadata (id, date, original sentence, etc.) and the AMR tree. In order to map the AMR tree to CoNLL, we assume that: (i) each node of the tree will be defined as either (a) a slash-separated variable-value pair, (b) only the variable name, or (c) only the value, and (ii) each branch will be defined by a relation name preceded by a colon.

2 Generation from AMRs

There are 932 activated graph-transduction rules in the pipeline; Steps 0 and 1 are AMR-specific, while in the rest of the grammars, only one rule is AMR-specific (at Step 3).

2.1 Mapping of AMRs onto predicate-argument graphs

The mapping produces graphs that contain linguistic information only, which includes meaning-bearing units and the following types of roles:
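
Separately from the role inventory, the node and branch assumptions stated in Section 1.2 can be illustrated with a small reader for AMR trees. This is a minimal sketch and not the authors' converter: the name `parse_amr`, the token pattern, the example AMR, and the flat triple output are all assumptions made for illustration, and the result is a list of (head, relation, dependent) triples rather than actual CoNLL’09 columns.

```python
import re

# Hypothetical token pattern: parentheses, slash, ":relation" names,
# double-quoted strings, and bare symbols (variables, concepts, constants).
TOKEN = re.compile(r'\(|\)|/|:[^\s()]+|"[^"]*"|[^\s()/:]+')


def parse_amr(text):
    """Read one AMR tree and return (root, nodes, edges).

    nodes maps each variable to its value; edges is a list of
    (head, relation, dependent) triples. A dependent is a variable
    (cases a and b in Section 1.2) or a bare value (case c).
    """
    tokens = TOKEN.findall(text)
    pos = 0
    nodes, edges = {}, []

    def parse_node():
        nonlocal pos
        if tokens[pos] != "(":
            # Case (b) or (c): only a variable name, or only a value.
            atom = tokens[pos]
            pos += 1
            return atom
        pos += 1                       # skip "("
        var = tokens[pos]
        pos += 1
        if tokens[pos] == "/":         # case (a): variable / value
            nodes[var] = tokens[pos + 1]
            pos += 2
        while tokens[pos] != ")":      # each branch: ":relation" + node
            rel = tokens[pos][1:]      # drop the leading colon
            pos += 1
            dependent = parse_node()
            edges.append((var, rel, dependent))
        pos += 1                       # skip ")"
        return var

    root = parse_node()
    return root, nodes, edges


# Illustrative input (not taken from the SemEval data).
example = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b :polarity -))"
root, nodes, edges = parse_amr(example)
```

In this sketch a re-entrant variable such as `b` under `go-01` is returned as a bare name and resolved through `nodes`, while a constant such as `-` has no entry in `nodes`; the sketch does not handle unquoted values that contain a colon or a slash.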


Similar resources

Sheffield at SemEval-2017 Task 9: Transition-based language generation from AMR

This paper describes the submission by the University of Sheffield to the SemEval 2017 Abstract Meaning Representation Parsing and Generation task (SemEval 2017 Task 9, Subtask 2). We cast language generation from AMR as a sequence of actions (e.g., insert/remove/rename edges and nodes) that progressively transform the AMR graph into a dependency parse tree. This transition-based approach relie...


Oxford at SemEval-2017 Task 9: Neural AMR Parsing with Pointer-Augmented Attention

We present an end-to-end neural encoder-decoder AMR parser that extends an attention-based model by predicting the alignment between graph nodes and sentence tokens explicitly with a pointer mechanism. Candidate lemmas are predicted as a pre-processing step so that the lemmas of lexical concepts, as well as constant strings, are factored out of the graph linearization and recovered through the p...


Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model

This paper describes a neural-network model which performed competitively (top 6) at the SemEval 2017 cross-lingual Semantic Textual Similarity (STS) task. Our system employs an attention-based recurrent neural network model that optimizes the sentence similarity. In this paper, we describe our participation in the multilingual STS task which measures similarity across English, Spanish, and Ara...


ECNU at SemEval-2017 Task 1: Leverage Kernel-based Traditional NLP features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity

To model semantic similarity for multilingual and cross-lingual sentence pairs, we first translate foreign languages into English, and then build an efficient monolingual English system with multiple NLP features. Our system is further supported by deep learning models, and our best run achieves a mean Pearson correlation of 73.16% in the primary track.


FCICU at SemEval-2017 Task 1: Sense-Based Language Independent Semantic Textual Similarity Approach

This paper describes FCICU team systems that participated in SemEval-2017 Semantic Textual Similarity task (Task1) for monolingual and cross-lingual sentence pairs. A sense-based language independent textual similarity approach is presented, in which a proposed alignment similarity method coupled with new usage of a semantic network (BabelNet) is used. Additionally, a previously proposed integr...



Publication date: 2017